(54) DECODER DEVICE FOR ENCODED VOICE

(57) Abstract:

PURPOSE: To always obtain decoded voice of satisfactory sound quality, without being limited to a
specific speaker, and to allow compression-encoded voice data to be used with a single decoder
device, by storing the parts that characterize a speaker, obtained when the speaker's voice is
encoded, together with the voice data.
CONSTITUTION: A decoder device 1 consists of a central processing unit (CPU), a data display
part 2, a data input part 3, a data main storage part 4 and a data external storage part 5. The
data external storage part 5 stores at least encoded voice data 6 and adaptive code book data 7.
Voice data are first encoded by a high-efficiency encoding system that uses an adaptive code
book. When the encoded data are stored in the storage part 5, the adaptive code book used at
the time of encoding is also stored in the same storage part 5. At decoding time, the decoder
device 1 reads the adaptive code book from the storage part 5 and decodes the data based on it.
* NOTICES *

JPO and NCIPI are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not reflect the original 
precisely. 

2. **** shows the word which can not be translated.
3. In the drawings, any words are not translated.



CLAIMS 



[Claim(s)] 

[Claim 1] A decoder device for encoded voice, characterized by comprising a storage device for
encoded voice data, in which encoded voice data and a feature code of a speaker's voice are stored
as information, and a decoder which decodes said encoded voice data based on said feature code
stored in said storage device.
[Claim 2] The storage device for encoded voice according to claim 1, and the decoder device using
it, characterized in that a plurality of said feature codes are stored in said storage device, each
corresponding to the characteristics of one of two or more speakers contained in said encoded
voice data.

[Claim 3] The decoder device for encoded voice according to claim 1 or 2, characterized in that said
feature code is represented by the adaptive code book of a CELP voice coding method which divides
an input voice signal into frames of a predetermined time length, obtains and outputs spectral
parameters representing the spectral envelope of said voice signal, divides each frame into
subframes of a predetermined time length, obtains and outputs long-term prediction parameters from
the past excitation so that the error with respect to said voice signal is minimized, and selects,
for each subframe, the optimal code vector from a code book prepared beforehand as the excitation.
DETAILED DESCRIPTION



[Detailed Description of the Invention] 
[0001] 

[Industrial Application] This invention relates to a storage device and a decoder device for
encoded voice data, and in particular to a storage device and a decoder device for encoded voice
that are suited to handling the voices of two or more speakers.
[0002] 

[Description of the Prior Art] With the progress of computer technology in recent years, equipment
has become smaller and processing power has improved dramatically. As a result, it is becoming
possible to handle the information around us as multimedia. For example, environments are being
developed in which computer data, previously dominated by text and graphics, can easily be
created using voice and images.

[0003] The volume of data is growing dramatically with this move to multimedia. This is an
inevitable consequence of voice and image data having a time-series data structure. Usually,
however, such analog data are not stored as they are in digitized form; compression or coding
that removes redundant signal components is applied. Since this can reduce the required storage
capacity to one part in several tens, equipment size and cost can be reduced. Moreover, since
it is the compressed data that are transmitted, the frequency band is used effectively during
transmission. For voice data in particular, high-efficiency voice coding methods with
transmission rates of 4 kbps or less are being developed, with a view to effective use of
frequency in the digitization of ** and of mobile radio communication.
[0004] 

[Problem(s) to be Solved by the Invention] As a general problem in the coding and decoding of
data, the most important point is how to avoid distorting the original signal. Although it is not
necessarily easy to keep this distortion small for every possible signal, it is realistic to
optimize so that the distortion is small for signals with particular characteristics.
[0005] When high-efficiency coding is applied only to the voice of a specified speaker, decoded
voice of higher sound quality can be realized if the coding takes sufficient account of the
characteristics of that speaker's voice. In the high-efficiency voice coding methods mentioned
above, the adaptive code book is in fact adopted as a structure that takes a speaker's
characteristics into account.

[0006] Conventionally, the above low bit rate coding methods had been developed for real-time
data such as mobile radio, and this adaptability had not necessarily been fully exploited.

[0007] For packaged voice data, the speaker can be specified relatively easily, so it becomes
possible to exploit this adaptability.

[0008] Even if the speaker is specified for each individual data package, a device that handles
packaged voice data must support many kinds of packages, so from the device's point of view the
speaker has to be regarded as unspecified. A language learning device is one example. Because
such a device supports two or more languages, the speaker is by nature unspecified.
[0009] The purpose of this invention is to provide, for a device that handles packaged voice data
encoded with high efficiency, a configuration that brings out the sound quality of the original
voice to the fullest extent, whoever the speaker is.

[0010] In applying high-efficiency voice coding to devices that handle packaged voice data as
described above, the purpose of this invention is to provide a storage device that stores data in
a configuration suitable for obtaining decoded voice of good sound quality even for the voices of
many unspecified speakers, and to enable the decoder device to support many kinds of packaged
data.
[0011] 

[Means for Solving the Problem] To attain the above purpose, this invention adopts a
configuration in which the part that characterizes a speaker, obtained when the speaker's voice
is encoded, is stored together with the voice data. Specifically, the voice data are first encoded
by a low bit rate coding method that uses an adaptive code book. Next, when the encoded data are
stored in a storage device, the adaptive code book used at the time of encoding is also stored in
the same storage device. Then, at decoding time, the decoder device reads the adaptive code book
from the storage device and decodes the data based on it.
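
The idea of keeping the code book and the encoded voice in one storage device can be sketched as a
simple container format. This is only an illustration: the tag `CBV1`, the header layout, the
32-bit float representation of the code book, and the function names are all assumptions, since
the text does not specify how the data are laid out.

```python
import struct
from typing import List, Tuple

MAGIC = b"CBV1"      # hypothetical 4-byte tag, not taken from the source
HEADER = "<4sII"     # tag, code book length (floats), voice length (bytes)


def pack_image(codebook: List[float], voice: bytes) -> bytes:
    """Store the adaptive code book state and the encoded voice side by side."""
    header = struct.pack(HEADER, MAGIC, len(codebook), len(voice))
    body = struct.pack(f"<{len(codebook)}f", *codebook)
    return header + body + voice


def unpack_image(blob: bytes) -> Tuple[List[float], bytes]:
    """Read back the code book first, then the encoded voice it belongs to."""
    magic, n_codebook, n_voice = struct.unpack_from(HEADER, blob, 0)
    if magic != MAGIC:
        raise ValueError("unrecognized storage image")
    offset = struct.calcsize(HEADER)
    codebook = list(struct.unpack_from(f"<{n_codebook}f", blob, offset))
    offset += struct.calcsize(f"<{n_codebook}f")
    voice = blob[offset:offset + n_voice]
    if len(voice) != n_voice:
        raise ValueError("truncated storage image")
    return codebook, voice


# Values chosen to be exactly representable as 32-bit floats.
image = pack_image([0.5, -1.25, 2.0], b"\x01\x02\x03\x04")
codebook, voice = unpack_image(image)
print(codebook, voice)
```

Because the decoder obtains the code book from the same image as the voice data, it needs no
prior knowledge of the speaker.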
[0012] 


[Function] With the low bit rate coding method using the adaptive code book described above,
compressed voice data of good sound quality can always be obtained, not only for a specified
speaker. Furthermore, the adaptive code book data are stored in the storage device together with
the compressed voice data; the decoder reads both from the storage device and decodes the voice
data. Therefore, the decoder can reproduce voice of this good sound quality even for voice data
from an unspecified speaker.
[0013] 

[Example] One example of this invention is explained below with reference to the drawings.
Drawing 1 is a block diagram of a decoder device that uses the storage device for encoded voice
according to this invention.

[0014] The decoder device 1 of this example consists of a central processing unit (CPU), a data
display section 2, a data input section 3, a data main storage 4, and a data external storage 5.
A liquid crystal display of about 5 inches diagonal is used for the data display section 2. A
pressure-sensitive touch panel bonded to the liquid crystal display and simple push-button
switches are used for the data input section 3. ROM and RAM are used for the data main storage 4,
and card-type memory is used for the data external storage 5.
[0015] Drawing 2 shows an example of the contents of the data stored in the data external storage
5, which consists of 4 MB of memory. At least the encoded voice data 6 and the adaptive code book
data 7 are stored in the external storage 5. The encoded voice data 6 are created by the
high-efficiency coding described below, at a transmission rate of about 4 kbps. Here, 3.6 MB was
allotted to the voice data 6, corresponding to about 120 minutes of voice, and several tens of kB
to the adaptive code book data 7.
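
The storage budget quoted above can be checked with a short calculation. The sketch assumes
decimal units (1 kbps = 1000 bit/s, 1 MB = 10^6 bytes) and a rate of exactly 4 kbps, neither of
which the text states explicitly; the function name is illustrative.

```python
def encoded_size_bytes(bit_rate_bps: int, duration_s: int) -> int:
    """Bytes needed to hold a constant-bit-rate stream of the given duration."""
    return bit_rate_bps * duration_s // 8


voice_bytes = encoded_size_bytes(4_000, 120 * 60)  # 4 kbps for 120 minutes
capacity_bytes = 4_000_000                         # 4 MB external storage (decimal MB assumed)
remaining_bytes = capacity_bytes - voice_bytes     # room for code book data and overhead

print(voice_bytes / 1e6, "MB of voice data")
print(remaining_bytes / 1e3, "kB remaining")
```

Under these assumptions 120 minutes of voice occupies 3.6 MB, which leaves about 400 kB, well
above the several tens of kB allotted to the adaptive code book data 7.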
[0016] Drawing 3 is a block diagram of the encoder. This encoder is based on the code-excited
linear prediction (CELP) voice coding method. The input is the voice signal 101, which is the
original voice data A/D-converted at a predetermined sampling frequency (usually 8 kHz). The
excitation is the weighted sum 114 of two components, which are multiplied by the gains 112 and
113 and then added: the long-term prediction vector 110, output by the adaptive code book 108,
which represents the periodicity of the excitation, and a code vector which represents the
components other than periodicity (randomness and noise-like components).
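
The construction of the excitation can be sketched as follows. This is a simplified illustration,
not the encoder of drawing 3: the function names, the toy vector lengths, and the rule for lags
shorter than a subframe (periodic repetition of the segment) are assumptions.

```python
from typing import List, Sequence


def long_term_prediction(past_excitation: Sequence[float], lag: int, n: int) -> List[float]:
    """Read n samples starting `lag` samples back in the past excitation.

    If lag < n the segment is repeated periodically. This convention is
    assumed here because the text does not specify it.
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must lie within the stored past excitation")
    segment = past_excitation[len(past_excitation) - lag:]
    return [segment[i % lag] for i in range(n)]


def build_excitation(ltp_vec, code_vec, gain_ltp, gain_code):
    """Weighted sum of the long-term prediction vector and a code vector."""
    return [gain_ltp * p + gain_code * c for p, c in zip(ltp_vec, code_vec)]


past = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ltp = long_term_prediction(past, lag=2, n=4)
exc = build_excitation(ltp, [1.0, -1.0, 1.0, -1.0], gain_ltp=0.5, gain_code=2.0)
print(ltp)  # [4.0, 5.0, 4.0, 5.0]
print(exc)  # [4.0, 0.5, 4.0, 0.5]
```

The adaptive code book contributes the periodic part taken from the past excitation, and the
second code vector contributes the remaining noise-like part.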

[0017] The code books are searched for the optimal excitation as follows. In principle it would
suffice to find an excitation for which the synthesized voice, obtained by feeding the excitation
into a synthesis filter, matches the original voice (input voice), but in practice some error
(quantization noise) always remains. The excitation is therefore chosen so as to minimize this
error. It is common to use an error that has been weighted so that it corresponds well to the
characteristics of human hearing.

[0018] To evaluate this perceptually weighted error, the excitation 114 is input to the weighted
synthesis filter 105 to obtain the weighted synthesized voice 116. The input voice 101 is passed
through the perceptual weighting filter 104 to obtain the weighted input voice 115, and the
difference from the weighted synthesized voice 116 gives the weighted error waveform 117. The
filter coefficients of the perceptual weighting filter 104 and the weighted synthesis filter 105
are determined by the LPC parameters 103, which are obtained beforehand by inputting the input
voice 101 to the LPC (linear prediction) analyzer 102.

[0019] The squared-error calculation section 118 computes the sum of squares of the weighted error
waveform 117 over the error evaluation interval, giving the weighted squared error 119. As
mentioned above, the excitation is the weighted sum of a long-term prediction vector and a
stochastic code vector, so determining the excitation comes down to determining the indices that
select a code vector from each code book. That is, the weighted squared error 119 is computed
while the long-term prediction lag 106 and the code vector index 107 are changed in turn, and the
error minimization section 120 selects the combination that minimizes the weighted error. This
way of determining the excitation is called "analysis by synthesis."
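
The search loop described above can be sketched as an exhaustive comparison. Several
simplifications are assumed and are not taken from the source: the perceptual weighting is
omitted, the gains are fixed rather than optimized, the lag and index are searched jointly, the
synthesis filter uses the sign convention s[n] = e[n] - sum_k a[k] s[n-1-k], and all names and
toy values are hypothetical.

```python
from itertools import product


def make_excitation(past_exc, lag, code_vec, gain_ltp, gain_code):
    """Gain-weighted sum of the lagged past excitation and a code vector."""
    segment = past_exc[len(past_exc) - lag:]
    return [gain_ltp * segment[i % lag] + gain_code * c for i, c in enumerate(code_vec)]


def synthesize(excitation, lpc):
    """All-pole synthesis filter with zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def search(target, past_exc, codebook, lpc, gain_ltp, gain_code, lags):
    """Try every (lag, index) pair and keep the one with the smallest squared error."""
    best = None
    for lag, idx in product(lags, range(len(codebook))):
        exc = make_excitation(past_exc, lag, codebook[idx], gain_ltp, gain_code)
        synth = synthesize(exc, lpc)
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if best is None or err < best[0]:
            best = (err, lag, idx)
    return best


past = [1.0, -2.0, 0.5, 3.0]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, -1.0, -1.0]]
lpc = [-0.5]
# Build a target from a known choice (lag 3, index 2) and check the search finds it.
target = synthesize(make_excitation(past, 3, codebook[2], 0.8, 1.0), lpc)
best = search(target, past, codebook, lpc, 0.8, 1.0, lags=range(2, 5))
print(best)
```

In this toy case the search recovers lag 3 and index 2 with zero error, because the target was
synthesized from exactly that combination.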
[0020] Once the optimal excitation has been determined in this way, the long-term prediction lag
106, the code book index 107, the gains 112 and 113, and the LPC parameters 103 are multiplexed
in the multiplexing section 121, and the resulting data 122 are stored in the external storage 5.
At this time the state of the adaptive code book 108 is also updated using the excitation 114.
Training of the code book is completed by repeating the above processing multiple times on the
voice of the same speaker. Needless to say, if the voice stored in the external storage is that
of two or more people, training that uses the voices of those people is required. Of course, the
voice data may also be encoded using the adaptive code book after the above training has been
completed.

[0021] Here, the data representing the final state of the adaptive code book are also stored in
the external storage 5. This makes it possible for the decoder described below to obtain decoded
voice of good sound quality, because an optimal excitation that takes the speaker's own
characteristics into account is always used.

[0022] The processing in the decoder is shown in drawing 4. The encoded data 222 read from the
external storage are first separated into the individual parameters in the demultiplexing section
221. The adaptive code book 208 is searched based on the long-term prediction lag 206, and the
long-term prediction vector 210 is output. Likewise, the stochastic code book 209 is searched
based on the code book index 207, and the excitation vector 211 is output. The long-term
prediction vector 210 and the excitation vector 211 are multiplied by the gains 212 and 213
respectively, and their sum is input to the synthesis filter 230 as the excitation 214. The filter
coefficients of the synthesis filter are determined by the LPC parameters 203. A postfilter is not
indispensable, but it is widely used to improve the subjective quality of the synthesized voice,
and its output is the output voice 232. Here, the adaptive code book data used are those stored
in the data external storage 5, loaded into the main storage 4 of the decoder device.
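
The decoding steps for one subframe can be sketched as below. This is a minimal illustration
under stated assumptions: the postfilter is omitted, `SubframeParams` is a hypothetical container
for the demultiplexed parameters, the synthesis filter uses the convention
s[n] = e[n] - sum_k a[k] s[n-1-k], and the filter memory is assumed to hold at least as many
samples as there are LPC coefficients.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubframeParams:
    """Hypothetical container for the demultiplexed parameters of one subframe."""
    lag: int
    index: int
    gain_ltp: float
    gain_code: float
    lpc: List[float]


def decode_subframe(p, adaptive_state, codebook, filter_memory):
    """Rebuild the excitation, run the synthesis filter, and update the decoder state."""
    code_vec = codebook[p.index]
    segment = adaptive_state[len(adaptive_state) - p.lag:]
    ltp = [segment[i % p.lag] for i in range(len(code_vec))]
    exc = [p.gain_ltp * a + p.gain_code * c for a, c in zip(ltp, code_vec)]

    history = list(filter_memory)  # most recent output sample last
    out = []
    for e in exc:
        acc = e - sum(a * history[-1 - k] for k, a in enumerate(p.lpc))
        history.append(acc)
        out.append(acc)

    new_state = (list(adaptive_state) + exc)[-len(adaptive_state):]
    new_memory = history[len(history) - len(p.lpc):]
    return out, new_state, new_memory


# The initial adaptive code book state stands in for the data read from external storage.
stored_state = [1.0, -2.0, 0.5, 3.0]
codebook = [[1.0, 1.0, -1.0, -1.0]]
params = SubframeParams(lag=3, index=0, gain_ltp=0.8, gain_code=1.0, lpc=[-0.5])
voice, state, memory = decode_subframe(params, stored_state, codebook, [0.0])
print(voice)
```

Starting from the stored code book state, rather than from an empty one, is what lets the decoder
reproduce the excitation that the encoder chose for that speaker.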
[0023] Although this invention presupposes a coding and decoding method that uses an adaptive
code book, it is needless to say not restricted to the above configuration. For example, the
encoder may be configured as in drawing 5 and the decoder as in drawing 6.

[0024] This configuration differs from the above example in the following points. As drawing 5
shows, a pulse excitation code book is added as an excitation source in addition to the adaptive
code book and the noise excitation code book. Based on the output of the voice classification
section, that is, on the acoustic characteristics of the input voice, the appropriate code book is
selected from the noise excitation and the pulse excitation as the target of the search. The
essence of this invention is that the characteristics of a speaker's voice are recorded on the
storage device together with the encoded data, and it does not depend on differences in the
detailed coding and decoding algorithm.
[0025] Next, an example is shown in which the storage device for encoded voice according to this
invention and the decoder device using it are applied to a language learning device. Items 10 to
15 in drawing 7 are external storage devices, each holding encoded voice data in one of six
languages together with the adaptive code book data of the corresponding speaker A to F. The
decoder device can work with any of the external storage devices 10 to 15, and because it loads
the respective code book data when decoding, it can perform voice decoding adapted to each
speaker's characteristics.
[0026] The above described an example in which, for voice data containing two or more speakers,
a single code book is trained using each speaker's data. Drawing 8 shows an example in which
adaptive code book data are provided for each of two or more speakers. The external storage 16
holds three adaptive code books G, H, and I, which correspond to three different sets of speaker
data Gd, Hd, and Id. The adaptive code books G, H, and I have been trained on the voices of
speakers Gd, Hd, and Id, respectively.
[0027] If the decoder device uses each matching combination when it reads and decodes the data,
optimal decoded voice can be obtained.
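
Pairing each block of voice data with its own code book can be sketched as a lookup. The text
does not say how a segment is associated with a code book, so the `speaker_id` tag, the class
names, and the toy code book values below are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VoiceSegment:
    speaker_id: str  # which speaker's code book this segment was encoded with
    payload: bytes   # encoded voice data


@dataclass
class ExternalStorage:
    codebooks: Dict[str, List[float]]  # one trained adaptive code book state per speaker
    segments: List[VoiceSegment]


def pair_segments(storage: ExternalStorage):
    """Return (code book, payload) pairs, refusing segments with no matching code book."""
    pairs = []
    for seg in storage.segments:
        if seg.speaker_id not in storage.codebooks:
            raise LookupError(f"no code book stored for speaker {seg.speaker_id!r}")
        pairs.append((storage.codebooks[seg.speaker_id], seg.payload))
    return pairs


storage = ExternalStorage(
    codebooks={"G": [0.1, 0.2], "H": [0.3, 0.4], "I": [0.5, 0.6]},
    segments=[VoiceSegment("H", b"\x10"), VoiceSegment("G", b"\x20")],
)
for codebook, payload in pair_segments(storage):
    print(codebook, payload)
```

Failing loudly on a missing code book is a design choice of this sketch: decoding a segment with
another speaker's code book would run, but would lose the benefit described above.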

[0028] This invention is not limited to the above example. First, the configuration of the decoder
device is not restricted to the one described. For example, the input device is not restricted to
the combination of a touch panel and push-button switches. Furthermore, the external storage may
be not only card-type memory but also an optically read disc such as a CD-ROM. Second, the
application is not restricted to a language learning device. It may be, for example, an
interpreting machine between different languages, or a music playback device.
[0029] 

[Effect of the Invention] According to this invention, the following effect is obtained. When
voice data created by the low bit rate coding method are decoded, the code book data describing
the characteristics of the speaker's voice, stored along with the voice data, are also read from
the storage device and applied, so voice of good sound quality can always be obtained, not only
for a specified speaker. As a result, compression-encoded voice data can be put to use with one
and the same decoder device.






DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] A block diagram of the decoder device using the storage device for encoded voice
according to this invention.

[Drawing 2] A drawing explaining the configuration of the external storage for encoded voice
according to this invention.

[Drawing 3] A block diagram of the operation of an encoder.

[Drawing 4] A block diagram of the operation of a decoder.

[Drawing 5] A block diagram of the operation of another encoder.

[Drawing 6] A block diagram of the operation of another decoder.

[Drawing 7] A drawing explaining the language learning device according to this invention.

[Drawing 8] A drawing explaining an external storage with another configuration.

[Description of Notations] 

1: Decoder device, 2: Data display section, 3: Data input section, 4: Data main storage,
5: Data external storage, 6: Encoded voice data, 7: Adaptive code book data.


